[00:08.030 --> 00:12.650]  Hello, everyone. Thank you for taking the time to listen to my speech.
[00:12.650 --> 00:15.650]  I am Li Kang, from the University of Georgia.
[00:15.990 --> 00:21.370]  Today, I am going to talk about some security issues related to AI applications.
[00:21.750 --> 00:24.990]  I have been working on this for a while.
[00:25.650 --> 00:29.730]  I have been working with the cybersecurity team of 360 in Beijing.
[00:31.350 --> 00:33.890]  Let me briefly introduce myself.
[00:33.890 --> 00:39.150]  If you are in the cybersecurity circle or follow CTF competitions,
[00:39.150 --> 00:40.710]  you may know me better,
[00:40.710 --> 00:46.490]  because I have worked with many security teams in CTF competitions.
[00:47.950 --> 00:52.970]  Personally, I was very honored to return to the mainland a decade ago
[00:52.970 --> 00:57.210]  and work with Mr. Duan Haixin and Mr. Zhuge Jianwei from Tsinghua University
[01:00.270 --> 01:03.390]  on the famous Blue Lotus team of Tsinghua University.
[01:03.390 --> 01:07.730]  At that time, I really spent a lot of time to get Mr. Duan down.
[01:08.130 --> 01:12.440]  The results of the Blue Lotus team were also good.
[01:12.440 --> 01:16.740]  And I was very honored to get the title of Blue Lotus team's chief instructor.
[01:17.180 --> 01:21.180]  In addition to the Blue Lotus team, I also worked with some other teams in the past.
[01:21.180 --> 01:24.180]  For example, if you have followed CTF teams for a long time,
[01:24.180 --> 01:31.720]  you may have heard of our Disekt team and SecDawgs team.
[01:31.720 --> 01:40.420]  The Disekt team also made it to the final of the Cyber Grand Challenge.
[01:41.220 --> 01:41.960]  In fact, there was a guest yesterday, Mito,
[01:41.960 --> 01:47.580]  who talked a lot about the whole CTF and the Cyber Grand Challenge.
[01:52.120 --> 01:57.500]  Today, I would like to talk more about the security issues of AI.
[01:58.940 --> 02:04.860]  What I talked about in the past was more about traditional security analysis,
[02:04.860 --> 02:07.300]  such as vulnerability discovery and exploitation.
[02:07.380 --> 02:16.040]  But I started to talk about the security of AI two years ago.
[02:16.040 --> 02:19.740]  At the end of 2016, I gave a speech in Shanghai.
[02:19.760 --> 02:24.640]  At that time, I had a student, a student I trained in the United States.
[02:25.080 --> 02:28.320]  I gave a speech at a famous Blue Lotus company in Hangzhou.
[02:28.780 --> 02:38.740]  After that, I communicated with them.
[02:38.740 --> 02:42.180]  I haven't seen my student for many years.
[02:42.180 --> 02:45.760]  His team member was very smart.
[02:45.940 --> 02:51.960]  He told me some words and started our work.
[02:51.960 --> 02:53.200]  What did he say?
[02:53.740 --> 03:00.000]  He told me, Mr. Li Kang, you used to work on vulnerabilities, software security, and CTF competitions.
[03:00.340 --> 03:02.900]  This direction is very good.
[03:02.900 --> 03:04.980]  But what you do is outdated.
[03:04.980 --> 03:07.780]  What you do is called classical Internet security.
[03:08.940 --> 03:11.360]  Now we all use AI to do security.
[03:11.360 --> 03:14.320]  Now we all need to use AI to do security.
[03:14.320 --> 03:18.420]  This is to tell you why we started to do security with AI.
[03:20.560 --> 03:22.860]  I put some examples in this presentation.
[03:22.860 --> 03:24.940]  I don't need to tell you too much.
[03:24.940 --> 03:29.440]  Because there are too many examples of AI applications now.
[03:29.900 --> 03:33.040]  I only put the typical one here, autonomous driving.
[03:38.120 --> 03:43.520]  Some researchers use this deep learning method to learn the artist's style.
[03:43.520 --> 03:46.300]  Then we can turn this photo into an artist's style.
[03:47.500 --> 03:51.520]  Why do we care about AI security?
[03:51.520 --> 03:54.940]  I want to give you two reasons.
[03:55.040 --> 03:57.780]  The first one may be very easy to understand.
[03:58.620 --> 04:03.260]  Many guests have mentioned this before, including Dr. Guitao.
[04:04.860 --> 04:11.680]  Many deep learning AI applications are used in very important systems.
[04:11.680 --> 04:12.980]  For example, autonomous driving.
[04:13.000 --> 04:15.920]  That is a matter of life and death.
[04:15.920 --> 04:21.000]  So from this perspective, it should be easy for everyone to understand.
[04:22.280 --> 04:25.160]  We need to study the security of AI systems.
[04:26.040 --> 04:29.620]  For example, I don't need to tell you too much.
[04:29.620 --> 04:37.480]  Because everyone knows that Tesla cars have been involved in fatal accidents in the past.
[04:37.800 --> 04:41.080]  I think it was last year that one case was finally settled.
[04:41.080 --> 04:49.400]  In fact, it was the first accident in the world in which a person died because of autonomous driving.
[04:51.100 --> 04:54.200]  This is a screenshot I took from the news afterwards.
[04:54.200 --> 05:00.420]  It appears that a car in autonomous driving mode hit a vehicle on the highway.
[05:00.500 --> 05:06.080]  At that time, Tesla reported that they were not sure whether the driver was driving.
[05:06.080 --> 05:09.100]  It could not be said that the person simply was not paying attention.
[05:09.100 --> 05:15.640]  It just so happened that there was a dashcam and some other on-site information.
[05:15.640 --> 05:19.520]  So in the end, it was confirmed that the death occurred under autonomous driving.
[05:20.120 --> 05:22.880]  There are many other examples. I will mention one more.
[05:22.880 --> 05:29.580]  In March this year, there was an Uber accident in Arizona.
[05:30.780 --> 05:32.400]  It was a self-driving car.
[05:33.300 --> 05:37.560]  In the evening, because the light was relatively dark,
[05:37.560 --> 05:44.020]  as the picture above shows, a pedestrian was pushing a bicycle across the road.
[05:44.020 --> 05:50.440]  The Uber car did not slow down and hit the pedestrian.
[05:50.440 --> 05:53.040]  The person died in the hospital that day.
[05:53.500 --> 05:54.780]  So this is also an example.
[05:54.780 --> 05:58.320]  Later, I saw an analysis of the situation.
[05:58.560 --> 06:03.940]  At that time, the other parts of the autonomous driving system were not mature.
[06:03.940 --> 06:05.660]  It mainly depended on the vision system.
[06:05.660 --> 06:07.420]  And the vision system was sensitive to light.
[06:07.500 --> 06:12.920]  After the video came in, the object recognition was not very good.
[06:14.020 --> 06:17.180]  I think the reason I just gave is easy for everyone to understand.
[06:17.760 --> 06:20.060]  We need to pay attention to safety in deep learning and artificial intelligence systems.
[06:20.700 --> 06:24.300]  But in fact, my original motivation is more of another reason.
[06:24.500 --> 06:25.120]  What is the reason?
[06:25.500 --> 06:28.300]  I used to work on attack and defense.
[06:30.080 --> 06:31.780]  I have always had a view.
[06:31.780 --> 06:35.280]  In fact, many people in our circle, including Mr. Yutao, also have the same view.
[06:35.280 --> 06:36.980]  What is the nature of security?
[06:37.440 --> 06:39.200]  The nature of security is adversarial confrontation.
[06:39.280 --> 06:45.120]  When we consider deep learning, machine learning, or other kinds of artificial intelligence
[06:45.120 --> 06:52.520]  in a critical system, we actually need to consider the adversarial setting.
[06:52.520 --> 06:55.160]  That is to say, your opponent may also use artificial intelligence.
[06:55.160 --> 06:58.040]  Or there is an artificial intelligence system that works against you,
[06:58.040 --> 07:01.100]  and you must find a way to bypass it.
[07:01.100 --> 07:02.280]  The first example.
[07:02.280 --> 07:04.780]  This is something I found in the domestic news.
[07:05.700 --> 07:08.780]  It is an application scenario for face recognition.
[07:08.780 --> 07:13.160]  The scenario is bringing face recognition into the school environment.
[07:13.160 --> 07:15.460]  Because I have been in the university for many years.
[07:15.460 --> 07:18.060]  I miss my university very much.
[07:19.720 --> 07:22.800]  When I was in the National University of China, I got up early in the morning.
[07:22.800 --> 07:25.920]  I want to leave early in the morning.
[07:25.920 --> 07:28.360]  What does this face recognition system do?
[07:28.360 --> 07:30.720]  It can monitor late arrivals and early departures,
[07:31.020 --> 07:33.340]  students skipping classes or swapping classes,
[07:33.340 --> 07:36.120]  including whether you look up and listen to the teacher.
[07:36.540 --> 07:38.180]  I just thought about it.
[07:38.180 --> 07:42.860]  When I was a student, I was a very, very good student.
[07:42.860 --> 07:47.000]  I would say that if face recognition were really used in this scenario,
[07:47.000 --> 07:49.380]  I would definitely think about how to evade it.
[07:49.420 --> 07:51.760]  Of course, this is not a very serious example.
[07:51.760 --> 07:53.720]  Some more realistic examples.
[07:53.760 --> 08:01.040]  What is shown in this picture is what we call a fake-post bot on the Internet.
[08:01.140 --> 08:03.780]  The picture I put up is the North American counterpart of Dianping.
[08:04.040 --> 08:05.100]  This is Yelp.
[08:06.640 --> 08:10.100]  There are often some fake reviews on the Internet,
[08:10.100 --> 08:14.480]  mainly to raise or lower the reputation of a business.
[08:14.480 --> 08:17.560]  In fact, you can buy these reviews directly on the Internet.
[08:17.940 --> 08:20.700]  And in this new system,
[08:20.900 --> 08:22.760]  according to the researchers' analysis,
[08:22.920 --> 08:27.500]  they used deep learning, an RNN, to learn from past
[08:27.580 --> 08:29.500]  posts written by others.
[08:29.500 --> 08:33.420]  It was later found that posts generated by the machine
[08:33.420 --> 08:35.520]  are more difficult for human reviewers to spot.
[08:35.900 --> 08:37.140]  This is an example.
[08:37.140 --> 08:37.820]  There is another one.
[08:37.820 --> 08:40.080]  I believe everyone has heard of it.
[08:40.080 --> 08:42.320]  There is a service called Dabaji that solves CAPTCHAs.
[08:42.380 --> 08:43.320]  What does Dabaji do?
[08:43.680 --> 08:44.760]  We have a lot of applications
[08:44.760 --> 08:46.200]  that now try to keep robots out.
[08:46.200 --> 08:47.800]  They put up a picture like this.
[08:47.960 --> 08:49.400]  This picture is one that is used.
[08:50.040 --> 08:53.360]  The purpose is to ensure that people can log in
[08:53.360 --> 08:57.580]  while machines are kept out.
[08:57.580 --> 09:00.840]  It is said that there is a site in China called 12306.
[09:00.840 --> 09:04.300]  Its verification pictures are said to be hard even for people to recognize.
[09:04.320 --> 09:08.440]  But it is said that now you can buy this Dabaji service on the Internet.
[09:08.440 --> 09:09.940]  Dabaji can handle everything,
[09:09.940 --> 09:10.860]  including Google.
[09:10.860 --> 09:13.240]  Google's more complex one is reCAPTCHA.
[09:13.240 --> 09:14.280]  It can handle that too.
[09:14.360 --> 09:21.520]  So this example also shows that artificial intelligence, once available, will also be used by the black market.
[09:21.520 --> 09:24.140]  So you must consider the adversarial scenario.
[09:24.200 --> 09:25.020]  The last example.
[09:25.020 --> 09:28.240]  This is also a friend I met at a meeting in China a while ago.
[09:28.900 --> 09:32.700]  He is a staff member of a research institute in China.
[09:32.700 --> 09:37.480]  In his spare time, he works on handwriting imitation.
[09:37.480 --> 09:39.040]  This is not a printer.
[09:39.040 --> 09:42.460]  The machine writes with a real pen.
[09:42.820 --> 09:45.040]  It learns first.
[09:45.040 --> 09:46.760]  For example, you write a few words.
[09:47.380 --> 09:53.180]  After it learns your handwriting, he can let the machine write words in your hand.
[09:53.280 --> 09:55.160]  Just like when we write a few words by hand,
[09:55.160 --> 09:56.300]  in fact, every time we write, it is different.
[09:56.300 --> 09:57.380]  It writes a word.
[09:57.400 --> 10:01.680]  Then I think this should be in our GIFON competition.
[10:01.680 --> 10:04.080]  Ask a handwriting expert to identify it.
[10:04.080 --> 10:05.660]  Let one person write, for example, ten words.
[10:05.660 --> 10:08.440]  Let the machine write ten words and mix them together.
[10:08.440 --> 10:10.580]  The handwriting expert can't tell them apart.
[10:10.580 --> 10:16.400]  So in conclusion, this is our motivation.
[10:16.400 --> 10:20.280]  Why do we want to study the adversarial side of AI?
[10:20.320 --> 10:24.670]  Back to the question, how do we attack AI?
[10:25.820 --> 10:30.880]  In fact, Witte mentioned the things we did in the past.
[10:30.880 --> 10:32.300]  So I will give you the first example.
[10:32.300 --> 10:35.600]  Let me use a target application.
[10:35.600 --> 10:40.420]  Let's say we want to attack a face recognition system.
[10:40.420 --> 10:42.780]  Or in this example, it may be a cat face recognition system.
[10:45.220 --> 10:49.720]  Look at how this is usually described in news reports:
[10:49.720 --> 10:52.260]  what kind of attack is used against such a thing?
[10:52.620 --> 10:56.560]  I don't deny the ideas in those news reports.
[10:57.640 --> 10:58.800]  Let me give you an example.
[10:58.800 --> 11:00.840]  This is from a famous online magazine in the United States.
[11:00.840 --> 11:05.200]  When Apple first released Face ID,
[11:05.200 --> 11:08.080]  they talked about how to attack Face ID
[11:08.300 --> 11:10.120]  using a face mask.
[11:10.220 --> 11:14.420]  And their conclusion was that attacking Face ID with a face mask was not very effective.
[11:15.140 --> 11:20.160]  But in fact, as software security people,
[11:20.160 --> 11:23.880]  when we attack a deep learning system, an AI system,
[11:23.880 --> 11:27.040]  I think everyone's thinking can be a little broader.
[11:27.180 --> 11:30.360]  In fact, we have talked about a lot of things.
[11:30.360 --> 11:37.640]  For example, the picture I put here is an example that is used very often recently.
[11:37.640 --> 11:39.720]  For example, I have these pictures.
[11:39.720 --> 11:42.860]  But if I add some interference to these pictures,
[11:42.860 --> 11:45.920]  it will lead to a misclassification.
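To make the idea of "adding some interference" concrete, here is a minimal sketch. It assumes a toy linear classifier with made-up weights and a four-pixel "image"; it is not the system shown on the slide. For a linear model, nudging each pixel a small step against the sign of its weight is enough to flip the label.

```python
# Toy linear classifier: score > 0 means "cat", otherwise "not cat".
# WEIGHTS, BIAS and the input pixels are made-up values for illustration only.
WEIGHTS = [0.5, -0.25, 0.75, -0.5]
BIAS = 0.0


def score(pixels):
    """Weighted sum of the pixel values plus a bias."""
    return sum(w * p for w, p in zip(WEIGHTS, pixels)) + BIAS


def classify(pixels):
    return "cat" if score(pixels) > 0 else "not cat"


def sign(value):
    return (value > 0) - (value < 0)


def perturb(pixels, epsilon):
    """Shift every pixel by at most epsilon in the direction that lowers the score.

    For a linear model the gradient of the score with respect to the input
    is simply the weight vector, so we step against the sign of each weight.
    """
    return [p - epsilon * sign(w) for p, w in zip(pixels, WEIGHTS)]


original = [0.5, 0.5, 0.5, 0.5]
adversarial = perturb(original, epsilon=0.25)

print(classify(original), "->", classify(adversarial))
```

Each pixel moves by only 0.25, yet the predicted label changes. Real attacks on deep networks follow the same principle with gradients computed through the whole model.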
[11:45.920 --> 11:52.240]  So this kind of attack should now be the more popular example of artificial intelligence security.
[11:54.100 --> 11:57.280]  If you meet a researcher of artificial intelligence security,
[11:57.280 --> 11:58.900]  and he says he is doing artificial intelligence security,
[11:58.900 --> 12:01.760]  my understanding is that more than 90% of the time,
[12:01.760 --> 12:04.100]  he is working on this:
[12:04.240 --> 12:06.120]  a kind of adversarial attack.
[12:06.120 --> 12:12.560]  Today, I think there are still a lot of people here with a security background.
[12:12.560 --> 12:13.500]  I want to talk about this.
[12:13.500 --> 12:17.440]  In fact, I personally think that if a real deep learning system is attacked,
[12:17.440 --> 12:19.320]  if it is broken in reality,
[12:19.320 --> 12:22.140]  I think to a large extent it will not be because of adversarial attacks,
[12:22.140 --> 12:23.660]  but for other reasons.
[12:23.660 --> 12:26.380]  Let's go back and talk about some examples.
[12:28.540 --> 12:31.640]  I'm talking about what it means to go beyond adversarial attacks.
[12:32.260 --> 12:35.040]  The first example was just mentioned by Dr. Vitala and Dr. Mehta.
[12:37.280 --> 12:40.640]  In fact, in the summer of last year,
[12:40.640 --> 12:42.140]  maybe because we had too much time,
[12:42.140 --> 12:47.580]  the team found more than a dozen CVEs.
[12:47.580 --> 12:49.900]  In fact, there may be more than 20 vulnerabilities.
[12:49.900 --> 12:50.860]  There are more than a dozen.
[12:51.800 --> 12:54.740]  This list has also been widely circulated.
[12:54.740 --> 12:59.200]  In fact, I am now taking this opportunity to promote the team of 360 IQ.
[12:59.560 --> 13:01.840]  What can we do with these vulnerabilities?
[13:02.180 --> 13:06.120]  I will give you a few examples below.
[13:06.940 --> 13:11.460]  One of the main reasons these vulnerabilities can be exploited is that
[13:11.460 --> 13:13.500]  deep learning applications today
[13:14.280 --> 13:18.340]  actually depend on a three-layer stack.
[13:18.340 --> 13:20.820]  When we build a deep learning application,
[13:20.820 --> 13:23.200]  we basically work in the top layer.
[13:24.540 --> 13:29.140]  We have our own model, including training data and trained parameters.
[13:29.360 --> 13:32.000]  But most of it depends on development frameworks.
[13:32.000 --> 13:33.600]  We list the commonly used ones here.
[13:33.600 --> 13:35.680]  On this side is TensorFlow, right?
[13:35.700 --> 13:37.000]  Then we have PyTorch here.
[13:38.000 --> 13:43.540]  All of these frameworks actually depend on a large number of third-party libraries,
[13:43.540 --> 13:46.440]  because we don't want to reinvent the wheel.
[13:46.440 --> 13:49.540]  I don't have to write everything myself in Python.
[13:49.540 --> 13:52.040]  We have a library like NumPy here.
[13:52.040 --> 13:53.820]  We have a library like OpenCV.
[13:54.040 --> 13:57.620]  We have a library like Google Protobuf for model description.
[13:59.380 --> 14:03.120]  If any of these libraries has a problem,
[14:03.120 --> 14:05.720]  it may lead to a problem in the application above.
[14:05.720 --> 14:08.880]  So that's why the vulnerabilities we just found
[14:08.880 --> 14:10.760]  affect a lot of applications.
[14:10.760 --> 14:12.480]  It's mainly because of this dependency.
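The dependency argument above can be sketched as a small graph walk. This is only an illustration: the component names and the edges in `DEPENDS_ON` are placeholder assumptions, not an inventory of what any real framework links against. The point is that one vulnerable library at the bottom marks everything above it as affected.

```python
# Hypothetical three-layer stack: applications -> frameworks -> libraries.
# Names and edges are illustrative assumptions, not a real dependency audit.
DEPENDS_ON = {
    "image_classifier_app": ["Caffe"],
    "speech_app": ["TensorFlow"],
    "Caffe": ["OpenCV", "protobuf"],
    "TensorFlow": ["NumPy", "protobuf"],
    "OpenCV": [],
    "NumPy": [],
    "protobuf": [],
}


def affected_by(vulnerable, graph=DEPENDS_ON):
    """Return every component that directly or transitively depends on `vulnerable`."""
    affected = set()
    changed = True
    while changed:
        changed = False
        for component, deps in graph.items():
            if component in affected:
                continue
            # Affected if it uses the vulnerable library itself,
            # or uses something that is already known to be affected.
            if vulnerable in deps or affected.intersection(deps):
                affected.add(component)
                changed = True
    return sorted(affected)


print(affected_by("OpenCV"))
print(affected_by("protobuf"))
```

In this toy graph a flaw in the image library reaches one framework and one application, while a flaw in the shared serialization library reaches everything built on either framework.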
[14:12.720 --> 14:15.600]  Let me give you a brief introduction.
[14:15.600 --> 14:18.220]  This is the attack we reported last year.
[14:18.260 --> 14:19.980]  It's not the main topic today.
[14:19.980 --> 14:21.500]  But I'm going to talk about this joke.
[14:21.500 --> 14:23.100]  The target of our attack, for example,
[14:23.100 --> 14:28.060]  we used a very classic example in image recognition.
[14:28.060 --> 14:31.020]  Everyone knows that there is a very famous competition in image recognition.
[14:31.020 --> 14:32.300]  It's called ImageNet.
[14:32.360 --> 14:35.600]  If you download many of the frameworks now,
[14:35.600 --> 14:38.540]  they come with sample applications written by others.
[14:39.060 --> 14:42.840]  Of course, how well does a deep learning application do?
[14:42.840 --> 14:47.080]  It's largely dependent on its training data and its model.
[14:47.080 --> 14:49.040]  So when we attacked it,
[14:49.040 --> 14:52.700]  we used a model that was trained by others.
[14:52.700 --> 14:58.220]  As I recall, it was the GoogLeNet model from Google.
[14:58.220 --> 15:00.060]  It was trained on ImageNet data.
[15:00.060 --> 15:02.800]  All the data was downloaded from the Google website.
[15:03.120 --> 15:04.840]  In such an application, you give a picture,
[15:04.840 --> 15:06.400]  you run the program next to it,
[15:06.400 --> 15:09.760]  and the result will tell you that there is a cat in this picture.
[15:09.760 --> 15:10.520]  What kind of cat?
[15:10.520 --> 15:11.780]  It will tell you these details.
[15:12.520 --> 15:15.580]  These labels are in its training data.
[15:15.900 --> 15:18.920]  When we attack, we choose such a scene.
[15:19.120 --> 15:20.720]  In order to show this attack,
[15:20.720 --> 15:23.740]  what is the main reason for doing this attack?
[15:24.020 --> 15:26.400]  We told the developers, including the developers of Caffe,
[15:26.400 --> 15:27.820]  that you have a problem.
[15:27.840 --> 15:29.120]  The developers of Caffe said,
[15:29.120 --> 15:30.480]  first, it's none of my business.
[15:30.480 --> 15:33.000]  Second, even if there is a problem, I don't care.
[15:33.000 --> 15:34.980]  It just means your program crashes.
[15:35.300 --> 15:36.460]  We gave this example.
[15:36.460 --> 15:37.660]  What does this example mean?
[15:37.660 --> 15:39.040]  I put four pictures.
[15:39.040 --> 15:40.580]  The picture in the upper left corner,
[15:40.580 --> 15:43.020]  I grabbed the picture from the Internet.
[15:44.560 --> 15:46.640]  And then, the other three pictures,
[15:46.640 --> 15:48.420]  we changed the picture.
[15:48.840 --> 15:50.400]  If you look closely,
[15:50.400 --> 15:51.700]  I don't know how this effect is.
[15:51.700 --> 15:52.360]  It looks like,
[15:52.360 --> 15:53.960]  you can actually see that
[15:53.960 --> 15:55.820]  these pictures look
[15:55.820 --> 15:58.180]  different from the original pictures.
[15:58.680 --> 16:00.040]  What I want to emphasize is that
[16:00.040 --> 16:02.960]  I didn't do an adversarial attack here.
[16:02.960 --> 16:07.220]  We didn't do a pixel-based attack on this picture.
[16:07.220 --> 16:08.220]  What we changed is
[16:08.220 --> 16:09.020]  the picture's
[16:09.020 --> 16:10.520]  other metadata,
[16:10.520 --> 16:12.100]  some of its colors,
[16:12.100 --> 16:13.700]  the information in the color palette.
[16:13.940 --> 16:15.200]  So it looks like
[16:15.200 --> 16:17.240]  the appearance may have changed a little bit.
[16:17.240 --> 16:18.620]  But the real reason for the change
[16:18.620 --> 16:22.140]  is that we actually added malicious code to it.
[16:22.580 --> 16:24.780]  Or, as we CTF
[16:24.780 --> 16:26.180]  security people would put it,
[16:26.180 --> 16:28.700]  we added code here.
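Why would editing palette information change how a picture looks without any pixel-level perturbation? A minimal sketch, assuming a hypothetical 2x2 indexed-colour image: the pixel data stores palette indices, so rewriting the palette alters the rendered colours while the pixel data stays byte-for-byte identical. The talk does not say which image format or field carried the payload, so this only illustrates the visual side effect, not the exploit.

```python
# A tiny indexed-colour image: each pixel holds a palette index, not a colour.
# Both palettes below are made-up values for illustration.
PIXEL_INDICES = [[0, 1], [1, 0]]
ORIGINAL_PALETTE = {0: (255, 255, 255), 1: (0, 0, 0)}
MODIFIED_PALETTE = {0: (250, 240, 230), 1: (10, 20, 30)}


def render(indices, palette):
    """Look up every pixel index in the palette to obtain RGB values."""
    return [[palette[i] for i in row] for row in indices]


before = render(PIXEL_INDICES, ORIGINAL_PALETTE)
after = render(PIXEL_INDICES, MODIFIED_PALETTE)

# Same pixel data, different appearance once the palette is rewritten.
print(before != after)
```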
[16:28.700 --> 16:30.140]  So these four pictures,
[16:30.140 --> 16:32.140]  I fed them to the program just now.
[16:32.140 --> 16:32.820]  It's GoogLeNet.
[16:32.820 --> 16:34.060]  Because it's in the first place
[16:34.060 --> 16:36.120]  in the application of deep learning.
[16:36.120 --> 16:38.420]  What will it look like?
[16:39.160 --> 16:40.820]  These are the four results of the operation.
[16:40.820 --> 16:42.180]  Of course, I don't expect everyone to say
[16:42.180 --> 16:43.280]  that every word can be seen clearly.
[16:43.280 --> 16:44.420]  That's not what I mean.
[16:44.420 --> 16:46.780]  But I use pictures to show the four results.
[16:46.780 --> 16:47.360]  The first result,
[16:47.360 --> 16:48.360]  the original picture,
[16:48.360 --> 16:49.280]  thrown in,
[16:49.280 --> 16:50.400]  of course, it's still the same.
[16:50.400 --> 16:51.840]  Because we didn't make any changes.
[16:51.840 --> 16:55.140]  Anyone can learn to recognize these pictures.
[16:55.140 --> 16:56.020]  No problem.
[16:56.860 --> 16:58.000]  The second picture,
[16:58.000 --> 16:59.400]  the one in the lower left corner,
[16:59.400 --> 17:00.360]  we threw this picture over.
[17:00.360 --> 17:03.060]  In fact, it just triggered a vulnerability.
[17:03.060 --> 17:04.620]  After triggering, we didn't do anything.
[17:04.620 --> 17:05.940]  That caused the program to crash.
[17:05.940 --> 17:07.440]  We call it a segmentation fault.
[17:07.900 --> 17:09.460]  A lot of errors.
[17:09.460 --> 17:12.220]  So the result is that
[17:12.220 --> 17:13.800]  this application did not output the result,
[17:13.800 --> 17:15.760]  but there was no other problem.
[17:15.760 --> 17:17.440]  I believe everyone knows that
[17:17.440 --> 17:18.300]  when we write a program,
[17:18.300 --> 17:19.840]  it's normal to have an error.
[17:21.280 --> 17:23.040]  The third example is that
[17:23.040 --> 17:24.060]  it's still the same program.
[17:24.060 --> 17:25.660]  We changed the third picture.
[17:25.720 --> 17:27.940]  The purpose is to tell everyone,
[17:27.940 --> 17:30.920]  especially those outside of security,
[17:30.920 --> 17:33.260]  that there is a vulnerability in the program,
[17:33.260 --> 17:33.880]  and this vulnerability
[17:33.880 --> 17:35.880]  can be exploited
[17:36.820 --> 17:38.440]  to cause bigger damage.
[17:38.440 --> 17:39.320]  In this example,
[17:39.320 --> 17:40.200]  we actually achieved
[17:40.200 --> 17:41.340]  control-flow hijacking.
[17:41.620 --> 17:43.320]  What do I mean by control-flow hijacking?
[17:43.440 --> 17:44.140]  That is to say,
[17:44.140 --> 17:45.640]  the original program classifies
[17:45.640 --> 17:48.140]  according to its machine learning method.
[17:48.140 --> 17:49.540]  When I hijack the control flow,
[17:49.540 --> 17:51.440]  I can let it run my program.
[17:51.440 --> 17:52.020]  That is to say,
[17:52.020 --> 17:54.300]  I control it to do anything,
[17:54.300 --> 17:55.300]  including what it outputs.
[17:55.300 --> 17:56.780]  So what do I want it to output?
[17:57.280 --> 17:58.680]  I just imagine that
[17:58.680 --> 18:00.100]  if I were an IT guy,
[18:00.100 --> 18:01.600]  what would I imagine?
[18:01.660 --> 18:03.360]  I want to become a little pig.
[18:03.360 --> 18:06.300]  So I want to turn it into a flagging pig.
[18:06.300 --> 18:07.320]  A scene of a little pig.
[18:08.400 --> 18:11.160]  The fourth one is actually the same as the example just now.
[18:11.160 --> 18:13.020]  Since we can hijack the control flow,
[18:13.020 --> 18:14.880]  in fact, in the fourth example,
[18:14.880 --> 18:15.800]  I spawned a shell.
[18:15.800 --> 18:17.280]  I actually showed that
[18:17.280 --> 18:19.440]  I can control your back end
[18:19.440 --> 18:20.460]  and run a shell on it,
[18:20.460 --> 18:21.740]  then I can take your data,
[18:21.740 --> 18:23.040]  I can delete your data, right?
[18:23.040 --> 18:24.080]  I put it in an IOU,
[18:24.080 --> 18:25.420]  such a picture.
[18:27.280 --> 18:28.240]  In fact, just now,
[18:28.240 --> 18:28.920]  this is a brief introduction
[18:28.920 --> 18:30.380]  of our previous work.
[18:30.380 --> 18:32.120]  Next, we would like to introduce
[18:32.340 --> 18:33.040]  a new piece of work,
[18:33.040 --> 18:34.240]  in fact, this is also
[18:34.240 --> 18:36.480]  what I call a crisis.
[18:37.260 --> 18:45.720]  My student's team commented on this classic security, and on the things that young people can do.
[18:46.000 --> 18:53.400]  I think that although I am old and do classic security, that way of thinking is sometimes still useful.
[18:53.540 --> 19:02.520]  Next, we will talk about a new, different attack method. We call it data attack, or sometimes frequency attack or price attack. What does it mean?
[19:03.040 --> 19:15.800]  To give the simplest example, take a classic deep learning application: face recognition. The face recognition example I put here is from Facebook, which has an application called DeepFace.
[19:15.940 --> 19:39.680]  How good is DeepFace? Its face recognition accuracy is said to be about 97%, which is close to, or even better than, people. First, deep learning is very strong; second, Facebook has a lot of data. So it is not strange at all that it does face recognition with deep learning. It is very easy to understand.
[19:39.900 --> 19:51.760]  In the end I did not get this specific face recognition system to test, but I hope that after listening you will understand that our attack should also work on it.
[19:51.760 --> 20:03.880]  Let's imagine face recognition. I send this picture to the face recognition system. I don't know if you can see it. Whose face will it think this is?
[20:05.500 --> 20:18.900]  Of course, if you look at this picture carefully, you can see that it is not a very clear picture. It should be a picture of Li Bingbing, an Asian celebrity.
[20:18.900 --> 20:28.260]  I send it to face recognition. Who will the AI system think it is? The answer is this: the AI system thinks it is Zhao Zicheng.
[20:30.500 --> 20:48.800]  It is funny. Then we can think about what the connection is between deep face recognition and this picture.
[20:50.860 --> 20:56.620]  We analyzed some deep learning applications and found that they have a hidden assumption.
[20:56.620 --> 20:58.300]  People don't usually talk about it.
[20:58.300 --> 21:02.020]  Everyone will pay attention to how great deep learning is and how powerful it is.
[21:02.020 --> 21:05.380]  Of course, what I'm talking about here is mainly image recognition.
[21:05.380 --> 21:06.760]  It's basically convolutional networks.
[21:06.760 --> 21:12.800]  Of course, if it is processing strings, it might be different.
[21:12.800 --> 21:17.680]  But I think, from my point of view, I think it's a common problem.
[21:17.680 --> 21:18.960]  What's the problem?
[21:19.980 --> 21:21.800]  I put three examples here today.
[21:21.800 --> 21:24.480]  Let me explain these three examples first, and then I'll share my point with you.
[21:24.640 --> 21:28.540]  The top one of the three examples is what we call MNIST.
[21:28.540 --> 21:29.640]  What is MNIST?
[21:30.080 --> 21:33.940]  I don't know. I can't see you here. The light is very strong.
[21:34.100 --> 21:39.740]  I believe that if you have studied deep learning by yourself,
[21:39.740 --> 21:42.740]  played with it yourself, or developed with it yourself,
[21:42.740 --> 21:45.860]  you must have seen this MNIST example.
[21:45.860 --> 21:54.060]  Why? Open any deep learning course.
[21:54.680 --> 21:57.980]  Its Hello World example is this MNIST.
[21:57.980 --> 21:59.260]  What does MNIST do?
[21:59.260 --> 22:00.780]  It takes a picture.
[22:01.080 --> 22:03.180]  There is a digit from 0 to 9 in the picture.
[22:03.180 --> 22:08.040]  MNIST recognition tells you which digit between 0 and 9 is in the picture.
[22:08.740 --> 22:11.300]  This is a very classic example.
[22:11.300 --> 22:17.200]  The middle example, which I just mentioned, is a very famous deep learning image competition, ImageNet.
[22:17.380 --> 22:20.940]  Professor Li Fei-fei has hosted it for many years.
[22:20.940 --> 22:22.860]  It has been very successful and very influential.
[22:22.860 --> 22:24.360]  This is the middle example.
[22:24.460 --> 22:30.080]  As for the third example: the first two you can think of as more for teaching or scientific research.
[22:30.520 --> 22:36.580]  The third example is a real one, deep learning for autonomous driving.
[22:36.580 --> 22:38.480]  You all know NVIDIA.
[22:38.560 --> 22:40.180]  NVIDIA's GPU is very powerful.
[22:40.180 --> 22:47.340]  In addition to its GPU platform, it also released an open-source autonomous driving model.
[22:47.760 --> 22:51.300]  Of course, autonomous driving is a complex system.
[22:51.300 --> 22:52.860]  It's not just the underlying GPU.
[22:52.860 --> 22:58.380]  It has many subsystems, including radar, cameras, and many more.
[22:58.380 --> 23:05.060]  Here we are looking at NVIDIA's DAVE-2 model.
[23:05.060 --> 23:07.740]  What it does is take a picture,
[23:07.740 --> 23:10.920]  a picture of the road, and give you a steering decision.
[23:11.120 --> 23:12.840]  What do they have in common?
[23:13.000 --> 23:14.700]  Look at the part on the right of this slide.
[23:15.200 --> 23:17.720]  The part on the right shows the models behind these practical examples.
[23:17.720 --> 23:19.640]  As I said, the model sits in the middle.
[23:19.680 --> 23:22.660]  The model determines your input.
[23:22.660 --> 23:29.280]  Whether you are training or classifying, the picture you send to the model has a fixed size.
[23:29.760 --> 23:31.420]  For example, handwriting recognition.
[23:31.420 --> 23:34.700]  No matter how big the picture you give it, it has to become 28x28 first.
[23:34.700 --> 23:38.040]  In the middle, for the ImageNet competition,
[23:38.040 --> 23:39.680]  I put three examples at the bottom.
[23:39.680 --> 23:42.140]  They were all top-ranked entries.
[23:42.260 --> 23:44.240]  From about 2014, 2015, and 2016.
[23:44.240 --> 23:45.760]  The first-place models.
[23:45.760 --> 23:48.700]  Their inputs are all pictures of a bit more than 200x200.
[23:49.160 --> 23:52.920]  And NVIDIA's driving model also has a fixed input size,
[23:52.920 --> 23:54.260]  about 200x66.
[23:54.820 --> 24:01.180]  The hidden assumption is that the dimensions of the model's input are fixed.
[24:02.380 --> 24:03.800]  That raises a natural question.
[24:03.800 --> 24:05.800]  What if my input is a different size from the model's?
[24:06.760 --> 24:08.540]  The first person may object and say,
[24:08.540 --> 24:10.200]  how can it be different?
[24:10.200 --> 24:11.680]  My input is the same size as the model's.
[24:11.680 --> 24:15.200]  Let me give you the most realistic example among the three.
[24:15.800 --> 24:17.960]  Let's take the NVIDIA example.
[24:17.960 --> 24:20.220]  You may think the first two examples are just research.
[24:20.220 --> 24:21.420]  The NVIDIA example says,
[24:21.420 --> 24:23.960]  OK, my model input is 200x66.
[24:23.980 --> 24:29.020]  NVIDIA has its own autonomous driving ecosystem on its official website.
[24:29.020 --> 24:31.040]  The ecosystem covers which hardware you should use,
[24:31.040 --> 24:32.140]  what kind of LIDAR you should use,
[24:32.140 --> 24:34.100]  what kind of car you should use, right?
[24:34.180 --> 24:36.880]  NVIDIA provides a list of recommended camera manufacturers.
[24:36.880 --> 24:39.280]  I listed three manufacturers here.
[24:39.280 --> 24:40.500]  If you check these manufacturers,
[24:40.500 --> 24:43.620]  they also say which of their cameras are matched with NVIDIA.
[24:45.060 --> 24:47.480]  You can see the output sizes of these cameras.
[24:47.800 --> 24:48.920]  I gave you a few examples.
[24:49.260 --> 24:52.440]  A310, 320x240, right?
[24:52.440 --> 24:54.040]  In the table below,
[24:54.040 --> 24:56.500]  the colors came out a bit mixed up.
[24:56.500 --> 24:58.360]  But I hope you understand the basic meaning.
[24:58.360 --> 25:02.580]  There is also 1920x1208, right?
[25:02.880 --> 25:05.400]  The first point I want to emphasize is that
[25:05.400 --> 25:08.880]  none of these camera outputs is the same size as the NVIDIA model's input.
[25:09.000 --> 25:10.120]  Right?
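The size mismatch described here can be put into numbers. What follows is a minimal sketch, not something shown in the talk: it uses only the sizes quoted above (a 200x66 model input, camera outputs of 320x240 and 1920x1208), and the helper name `pixel_ratio` is made up for illustration.

```python
# Sizes quoted in the talk, written as (width, height).
MODEL_INPUT = (200, 66)
CAMERA_OUTPUTS = {
    "320x240": (320, 240),
    "1920x1208": (1920, 1208),
}


def pixel_ratio(camera, model=MODEL_INPUT):
    """Return how many camera pixels exist for each model-input pixel."""
    cam_w, cam_h = camera
    model_w, model_h = model
    return (cam_w * cam_h) / (model_w * model_h)


for name, size in CAMERA_OUTPUTS.items():
    print(f"{name}: about {pixel_ratio(size):.1f} camera pixels per model pixel")
```

For the larger camera the ratio is well over 100, so whatever scaling step sits in front of the model has to discard or merge the vast majority of the captured pixels.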
[25:10.120 --> 25:11.900]  Second, you may ask,
[25:11.900 --> 25:16.080]  why don't I make the model bigger?
[25:16.400 --> 25:18.340]  Make the input a little bit bigger.
[25:18.340 --> 25:19.280]  Why not?
[25:19.600 --> 25:22.040]  Because, as far as I know,
[25:22.040 --> 25:26.520]  if the input dimensions of your model become larger,
[25:26.520 --> 25:27.980]  the number of neurons
[25:27.980 --> 25:29.780]  and the complexity of the whole computation
[25:29.780 --> 25:32.460]  grow at least quadratically,
[25:32.460 --> 25:33.520]  or I can't remember the exact figure.
[25:33.520 --> 25:36.040]  It's an explosive growth.
[25:36.040 --> 25:39.640]  So, in order to keep the computation under control,
[25:40.340 --> 25:41.860]  the basic demand for compute,
[25:41.860 --> 25:44.730]  we all tend to use a smaller model.
[25:45.320 --> 25:46.280]  There is another question:
[25:46.280 --> 25:48.360]  why don't we make a variable-size model?
[25:48.360 --> 25:49.560]  As far as I know,
[25:49.560 --> 25:52.580]  at least for convolutional networks,
[25:52.580 --> 25:54.480]  deep learning experts still don't know
[25:54.480 --> 25:56.580]  how to make a variable-size model.
[25:56.580 --> 25:58.560]  Right now our models all have a fixed size.
[25:58.800 --> 26:00.180]  With this premise,
[26:00.180 --> 26:02.300]  I believe everyone can easily follow.
[26:02.300 --> 26:04.540]  The deep learning model as we usually see it,
[26:04.540 --> 26:05.820]  in a whole application,
[26:05.820 --> 26:07.560]  sits in a pipeline like this.
[26:07.560 --> 26:09.740]  The middle part
[26:09.740 --> 26:10.980]  is our model,
[26:10.980 --> 26:13.220]  the fixed-input model I mentioned.
[26:13.220 --> 26:14.420]  Whether it's training,
[26:14.420 --> 26:15.920]  classification,
[26:15.920 --> 26:17.420]  or decision-making,
[26:17.420 --> 26:19.520]  it's always a fixed-size model.
[26:19.520 --> 26:20.940]  But take your input,
[26:21.500 --> 26:23.360]  let's say this big picture of a cat I drew,
[26:23.360 --> 26:24.940]  which is relatively wide.
[26:25.340 --> 26:26.140]  After it comes in,
[26:26.140 --> 26:28.220]  you first need to scale it
[26:28.220 --> 26:29.720]  to the same size as the model's input.
[26:30.160 --> 26:31.700]  Then you can do training,
[26:31.700 --> 26:32.860]  and so on.
[26:35.060 --> 26:36.900]  With this premise,
[26:36.900 --> 26:38.260]  let's take a look again.
[26:39.340 --> 26:42.060]  Let me rewind a little bit.
[26:42.960 --> 26:44.800]  After I finished this part,
[26:44.800 --> 26:47.140]  I told my students about it.
[26:47.140 --> 26:48.260]  My students said,
[26:48.260 --> 26:50.220]  what I wrote is an application,
[26:50.220 --> 26:51.700]  and I have never called any scaling.
[26:52.140 --> 26:53.740]  I don't need scaling.
[26:53.740 --> 26:55.980]  What you said doesn't exist in my case.
[26:56.460 --> 26:58.000]  Let me give you some examples.
[26:58.000 --> 27:00.600]  It's true that you may not have used it in your own program.
[27:00.600 --> 27:01.800]  For example, the first one I put here
[27:01.800 --> 27:02.800]  is a TensorFlow example.
[27:03.400 --> 27:04.440]  For the TensorFlow example,
[27:04.440 --> 27:07.240]  I don't need you to read the code now.
[27:07.240 --> 27:08.420]  You can look at it later.
[27:08.640 --> 27:09.800]  What does this function do?
[27:09.940 --> 27:11.660]  I need a tensor.
[27:11.660 --> 27:13.320]  I want to read the data of an image
[27:13.320 --> 27:14.800]  and convert it into a tensor.
[27:15.600 --> 27:16.980]  So
[27:16.980 --> 27:19.060]  if you want to feed in an image,
[27:19.060 --> 27:20.740]  you can use this function in TensorFlow.
[27:20.820 --> 27:22.680]  The red line in the function is
[27:22.680 --> 27:24.880]  where the resize, the scaling, happens.
[27:25.320 --> 27:26.160]  It's inside TensorFlow.
[27:26.160 --> 27:28.500]  You may not have it in your own program,
[27:28.500 --> 27:29.680]  so you think you don't use it.
[27:29.680 --> 27:32.740]  The second example is another commonly used framework,
[27:32.740 --> 27:33.640]  DeepDetect.
[27:33.640 --> 27:34.700]  Sorry, everyone,
[27:34.700 --> 27:39.200]  my original slide got distorted here,
[27:39.200 --> 27:40.440]  so the colors,
[27:40.440 --> 27:42.320]  the highlighted part and the red part,
[27:42.320 --> 27:43.220]  do not line up.
[27:43.220 --> 27:43.840]  But basically,
[27:43.840 --> 27:44.780]  I want to emphasize that
[27:44.780 --> 27:48.620]  these frameworks all contain
[27:48.620 --> 27:50.620]  resizing and scaling.
[27:52.980 --> 27:54.000]  Let's take a look at
[27:54.000 --> 27:55.020]  how the scaling is done.
[27:56.180 --> 27:58.160]  I don't know if you already have an idea.
[27:58.160 --> 28:00.500]  In fact, the algorithm is not complicated.
[28:00.500 --> 28:01.740]  Let me give you a few examples.
[28:01.740 --> 28:03.560]  The first one is abstract.
[28:03.560 --> 28:04.520]  For example,
[28:04.520 --> 28:08.240]  if I turn a big picture into a small picture,
[28:08.240 --> 28:08.800]  what should I do?
[28:08.800 --> 28:10.220]  For example,
[28:10.220 --> 28:15.200]  I have a 3x3 grid.
[28:15.200 --> 28:17.060]  If I shrink it,
[28:17.060 --> 28:19.320]  it becomes a 2x2 grid.
[28:19.320 --> 28:21.760]  How should the picture be scaled?
[28:22.700 --> 28:24.480]  The core task
[28:24.480 --> 28:26.000]  is to calculate
[28:26.000 --> 28:27.340]  the new pixel values.
[28:27.960 --> 28:28.220]  But
[28:28.220 --> 28:30.440]  for the scaling,
[28:31.300 --> 28:33.360]  I use a concept
[28:33.360 --> 28:35.900]  borrowed from language,
[28:35.900 --> 28:36.620]  which is called meaning.
[28:36.620 --> 28:38.780]  If a picture has meaning,
[28:39.160 --> 28:40.540]  then when you scale it,
[28:40.540 --> 28:42.860]  you shouldn't change the meaning, should you?
[28:43.440 --> 28:45.260]  How do we do the scaling?
[28:45.320 --> 28:47.000]  For example,
[28:47.000 --> 28:47.660]  let's assume that
[28:47.660 --> 28:50.120]  these four points become one point.
[28:50.180 --> 28:50.820]  The value of this one point
[28:50.820 --> 28:53.320]  is calculated from these four points.
[28:54.360 --> 28:56.420]  Or, from my perspective,
[28:56.420 --> 28:58.020]  let's say it is guessing a number.
[28:58.020 --> 28:59.120]  I had four numbers,
[28:59.120 --> 29:00.400]  and you guess the one number.
[29:00.680 --> 29:02.540]  Of course, there are a lot of algorithms.
[29:02.540 --> 29:04.400]  For the common algorithms,
[29:04.400 --> 29:06.100]  I will give you one or two examples.
[29:06.100 --> 29:07.420]  There is a common algorithm called
[29:07.420 --> 29:08.620]  bilinear interpolation,
[29:09.240 --> 29:11.240]  which has its own formula.
[29:11.360 --> 29:12.500]  It means that
[29:12.500 --> 29:14.120]  there is a point x in the middle,
[29:14.120 --> 29:15.600]  and there are four points around it.
[29:15.600 --> 29:17.120]  We need to calculate the value
[29:17.120 --> 29:19.440]  of x from these surrounding points.
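The scheme described here, a point in the middle whose value is computed from the four points around it, is commonly known as bilinear interpolation. The following is a minimal sketch of that idea in plain Python, not the formula from the speaker's slide; the function name `bilinear_sample` and the list-of-rows image layout are assumptions made for illustration.

```python
def bilinear_sample(img, x, y):
    """Interpolate the value at fractional position (x, y).

    img is a list of rows of numbers; x is the column coordinate and
    y is the row coordinate. Both are assumed to be non-negative and
    to lie inside the image.
    """
    height, width = len(img), len(img[0])
    x0, y0 = int(x), int(y)          # top-left neighbor
    x1 = min(x0 + 1, width - 1)      # clamp at the right edge
    y1 = min(y0 + 1, height - 1)     # clamp at the bottom edge
    fx, fy = x - x0, y - y0          # fractional offsets in [0, 1)

    # Blend horizontally along the top and bottom rows, then vertically.
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


grid = [[0, 10],
        [20, 30]]
print(bilinear_sample(grid, 0.5, 0.5))  # the point in the middle of the four
```

Every output value is a weighted average of only four source pixels, so source pixels that never act as one of those four neighbors have no influence on the scaled picture.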
[29:20.560 --> 29:22.540]  There is another common algorithm
[29:22.540 --> 29:24.040]  called nearest neighbor.
[29:25.900 --> 29:28.340]  Sorry, I can't show you the picture.
[29:29.360 --> 29:31.160]  You can look at the Chinese version.
[29:31.160 --> 29:34.480]  Say it was a 4x4 picture,
[29:34.480 --> 29:36.120]  and it is turned into a 2x2 picture.
[29:36.120 --> 29:39.040]  The nearest neighbor algorithm
[29:39.040 --> 29:40.540]  simply chooses the point
[29:40.540 --> 29:42.920]  in the upper left corner.
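The nearest neighbor case just described can be sketched in a few lines. This is an illustration under assumptions, not code from the talk: the function name `downscale_nearest` is made up, and real libraries differ in exactly which source pixel they pick (for example because of pixel-center conventions). This version keeps the upper-left pixel of each block, matching the description above.

```python
def downscale_nearest(img, out_h, out_w):
    """Shrink img by copying one source pixel per output pixel.

    For output position (r, c) the source pixel at
    (r * in_h // out_h, c * in_w // out_w) is copied, which is the
    upper-left pixel of each block when the sizes divide evenly.
    """
    in_h, in_w = len(img), len(img[0])
    return [
        [img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


picture = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
print(downscale_nearest(picture, 2, 2))
```

In the 4x4 to 2x2 case only 4 of the 16 source pixels are ever read; the other 12 can hold anything without changing the output.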
[29:43.120 --> 29:45.300]  These are the common algorithms.
[29:45.300 --> 29:46.120]  Of course, there are other algorithms,
[29:46.120 --> 29:47.580]  like bicubic
[29:47.580 --> 29:48.600]  interpolation,
[29:48.600 --> 29:50.980]  and others.
[29:51.080 --> 29:52.840]  These algorithms used to be used
[29:52.840 --> 29:53.720]  for image processing
[29:53.720 --> 29:56.240]  and image computation.
[29:56.540 --> 29:59.160]  Of course, now they are
[29:59.160 --> 30:00.460]  used by artificial intelligence experts.
[30:02.000 --> 30:03.540]  In the past two years,
[30:03.540 --> 30:04.340]  I have been in contact with artificial intelligence experts.
[30:04.340 --> 30:05.820]  For example,
[30:05.820 --> 30:06.660]  I am a consultant
[30:07.840 --> 30:08.540]  at 360.
[30:10.040 --> 30:13.840]  360 has an artificial intelligence security team.
[30:13.840 --> 30:15.200]  They have very strong
[30:15.200 --> 30:17.800]  artificial intelligence experts.
[30:17.800 --> 30:18.840]  For example,
[30:18.840 --> 30:19.840]  Professor Yan Shuicheng
[30:19.840 --> 30:23.040]  and other experts
[30:23.040 --> 30:24.180]  in artificial intelligence.
[30:24.180 --> 30:25.240]  I have been in contact with them.
[30:25.240 --> 30:29.040]  I found that,
[30:29.600 --> 30:30.540]  comparing security experts
[30:30.540 --> 30:33.560]  with artificial intelligence experts,
[30:33.560 --> 30:35.360]  the AI experts have something in common.
[30:36.300 --> 30:37.340]  Experts in deep learning
[30:39.000 --> 30:41.040]  have a kind heart.
[30:41.820 --> 30:43.880]  Their algorithms
[30:43.880 --> 30:45.320]  are very kind.
[30:45.760 --> 30:46.540]  Of course,
[30:46.540 --> 30:47.300]  I am not saying that
[30:47.300 --> 30:48.120]  our security experts
[30:48.120 --> 30:49.100]  are evil.
[30:49.140 --> 30:52.520]  Our security experts
[30:55.140 --> 30:56.540]  are just a bit sick.
[30:56.820 --> 30:58.660]  We can see that
[30:58.660 --> 31:00.500]  when you input a 4x4 picture,
[31:00.500 --> 31:02.360]  it looks like this.
[31:02.360 --> 31:09.960]  I am sorry, the slide is not showing properly.
[31:09.960 --> 31:11.620]  After we switched displays,
[31:11.620 --> 31:14.420]  what I want to say is that
[31:14.960 --> 31:16.940]  other content can be placed
[31:18.960 --> 31:19.560]  at specific positions.
[31:19.560 --> 31:23.140]  As security experts,
[31:23.140 --> 31:24.420]  we can see that
[31:24.420 --> 31:26.380]  this means
[31:26.380 --> 31:27.120]  that the attacker
[31:27.120 --> 31:27.520]  can put other content
[31:27.520 --> 31:29.620]  at those positions.
[31:31.660 --> 31:32.640]  After we switched displays,
[31:32.640 --> 31:33.820]  you can't see anything here.
[31:34.340 --> 31:35.060]  Of course,
[31:35.060 --> 31:38.160]  I will just use this concept
[31:38.160 --> 31:38.820]  to explain.
[31:38.820 --> 31:40.040]  I will show you
[31:40.040 --> 31:41.580]  the effect of our attack.
[31:41.580 --> 31:43.100]  I hope that
[31:43.100 --> 31:44.620]  the effect won't disappear on this display.
[31:45.720 --> 31:48.060]  Actually, what I mean is that
[31:48.060 --> 31:49.380]  when I artificially
[31:49.380 --> 31:50.880]  construct some input
[31:50.880 --> 31:53.380]  and send it to the scaling algorithm,
[31:53.920 --> 31:55.300]  a cat's face
[31:55.300 --> 31:57.460]  may become a dog's face.
[31:58.180 --> 31:59.160]  Because
[31:59.160 --> 32:01.640]  we control the data,
[32:01.640 --> 32:03.220]  we can change
[32:03.220 --> 32:04.840]  the meaning of the picture.
[32:06.480 --> 32:07.400]  The kind
[32:09.000 --> 32:10.360]  algorithm designers
[32:10.360 --> 32:12.380]  usually assume
[32:12.380 --> 32:13.420]  that your input is
[32:13.420 --> 32:14.660]  natural.
[32:14.940 --> 32:18.540]  But with a crafted input, the effect
[32:18.540 --> 32:20.560]  is actually
[32:20.560 --> 32:21.700]  different.
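To make the idea of an artificially constructed input concrete, here is a toy sketch. It is not the speaker's tool or algorithm: it assumes the simplest possible scaler (a nearest neighbor downscale that keeps the upper-left pixel of each block), and the names `downscale_nearest` and `embed_target` are hypothetical. The attacker overwrites only the pixels the scaler will read.

```python
def downscale_nearest(img, out_h, out_w):
    """Nearest neighbor downscale that keeps one source pixel per output pixel."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def embed_target(source, target):
    """Return a copy of source whose sampled pixels are taken from target."""
    in_h, in_w = len(source), len(source[0])
    out_h, out_w = len(target), len(target[0])
    crafted = [row[:] for row in source]
    for r in range(out_h):
        for c in range(out_w):
            # Overwrite exactly the pixel that downscale_nearest will read.
            crafted[r * in_h // out_h][c * in_w // out_w] = target[r][c]
    return crafted


source = [[0] * 8 for _ in range(8)]   # what a human mostly sees (all 0s)
target = [[9, 9], [9, 9]]              # what the model should end up seeing
crafted = embed_target(source, target)

changed = sum(
    a != b
    for row_a, row_b in zip(source, crafted)
    for a, b in zip(row_a, row_b)
)
print(changed, "of", 8 * 8, "pixels changed")
print(downscale_nearest(crafted, 2, 2))
```

Only 4 of the 64 pixels differ from the source, yet the scaled output equals the target exactly. Interpolating scalers such as bilinear need a more careful construction, which this sketch does not attempt.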
[32:21.980 --> 32:27.680]  The concept behind what we did is not complicated.
[32:28.760 --> 32:30.800]  We wrote a tool
[32:30.800 --> 32:31.980]  to generate the attack,
[32:31.980 --> 32:33.640]  and I will show you
[32:33.640 --> 32:36.120]  the effect
[32:36.120 --> 32:37.760]  it generates.
[32:38.000 --> 32:39.860]  First, I will show you
[32:39.860 --> 32:41.880]  an early construction.
[32:41.880 --> 32:44.120]  When you look at the picture,
[32:44.120 --> 32:46.040]  ignore the icon,
[32:46.040 --> 32:48.620]  which is the watermark in the background.
[32:48.620 --> 32:50.740]  I was told that
[32:50.740 --> 32:53.160]  we are using a new
[32:53.960 --> 32:54.580]  display from some manufacturer.
[32:55.720 --> 32:57.640]  Ignore the icon.
[32:57.640 --> 33:00.100]  Look at the picture in the middle.
[33:00.340 --> 33:01.420]  You can see
[33:01.420 --> 33:03.120]  the input.
[33:03.120 --> 33:05.340]  If you look at the number,
[33:05.340 --> 33:06.220]  you know
[33:08.040 --> 33:08.660]  what it is.
[33:08.660 --> 33:10.140]  But if I ask which
[33:10.140 --> 33:13.120]  number between 0 and 9 it is,
[33:13.840 --> 33:15.370]  I would choose 7.
[33:15.760 --> 33:17.080]  This is a hint.
[33:17.840 --> 33:18.380]  This is the result
[33:18.380 --> 33:19.620]  after scaling.
[33:22.320 --> 33:23.080]  The concept
[33:23.080 --> 33:24.340]  is easy to understand.
[33:24.340 --> 33:26.280]  You see a wide picture.
[33:27.500 --> 33:29.060]  Some information is lost.
[33:29.060 --> 33:31.500]  If we want to remove it,
[33:31.500 --> 33:32.160]  we need to remove the icon.
[33:33.340 --> 33:35.260]  We did other things in the beginning.
[33:35.260 --> 33:36.040]  You don't have to
[33:36.040 --> 33:37.800]  make it narrow.
[33:38.640 --> 33:40.180]  You can make it bigger.
[33:40.180 --> 33:41.660]  If you make it smaller,
[33:41.660 --> 33:45.100]  it will look like this.
[33:46.440 --> 33:47.680]  In the early days,
[33:47.680 --> 33:49.560]  it didn't work well.
[33:49.560 --> 33:50.700]  Look at the picture.
[33:50.700 --> 33:52.220]  There are some animals in the picture.
[33:52.220 --> 33:54.040]  Do you know what they are?
[33:54.360 --> 33:56.360]  If you can see clearly,
[33:56.360 --> 33:57.820]  they are sheep.
[33:57.820 --> 33:59.240]  This is the result
[33:59.820 --> 34:01.420]  after scaling.
[34:02.600 --> 34:03.260]  The effect is
[34:03.260 --> 34:05.580]  always the same.
[34:05.580 --> 34:08.320]  If you look closely,
[34:08.320 --> 34:09.480]  you can see the water.
[34:09.480 --> 34:11.440]  This is the picture we made in the early days.
[34:11.440 --> 34:13.480]  The effect is not so good.
[34:13.480 --> 34:15.680]  But the data is hidden in it.
[34:16.080 --> 34:18.020]  How do we generate
[34:18.020 --> 34:20.080]  this attack automatically?
[34:20.320 --> 34:22.520]  I will talk about the process.
[34:22.780 --> 34:24.960]  We don't have time to talk about the specific algorithm.
[34:24.960 --> 34:27.040]  If I want to generate
[34:27.040 --> 34:27.960]  an attack,
[34:27.960 --> 34:30.100]  we first need to
[34:30.100 --> 34:31.780]  describe the data
[34:33.400 --> 34:34.160]  before and after scaling.
[34:34.660 --> 34:35.600]  It is very simple.
[34:35.600 --> 34:37.660]  Scaling is a function:
[34:37.660 --> 34:39.180]  its output is the picture
[34:39.180 --> 34:41.800]  after scaling.
[34:41.800 --> 34:46.760]  So I will talk about the picture after it goes through the scaling function.
[34:46.940 --> 34:49.060]  If I want to make an attack,
[34:49.060 --> 34:51.840]  for example,
[34:51.840 --> 34:53.360]  the picture on the left
[34:53.360 --> 34:56.500]  is the original picture I want to use.
[34:57.080 --> 34:57.760]  The picture on the right
[34:57.760 --> 34:59.680]  is the target I want the attack to produce.
[34:59.920 --> 35:01.540]  I want to craft the attack picture.
[35:01.600 --> 35:05.120]  In fact, I just add some interference to the original
[35:05.120 --> 35:06.320]  and generate the middle picture.
[35:06.320 --> 35:08.540]  But after scaling,
[35:08.540 --> 35:11.440]  the middle picture becomes the target picture.
[35:12.340 --> 35:14.240]  It is a very simple math expression.
[35:14.240 --> 35:15.800]  And I want to tell you
[35:15.800 --> 35:16.900]  that it is not difficult to understand.
[35:16.900 --> 35:19.020]  In fact, my triangle,
[35:19.020 --> 35:19.800]  this delta,
[35:19.800 --> 35:22.660]  can actually take many forms.
[35:23.060 --> 35:24.480]  In fact, the simplest way
[35:24.480 --> 35:26.560]  is to add it on top of the original picture.
[35:26.760 --> 35:28.680]  In fact, what does our attack do?
[35:29.120 --> 35:30.380]  We want to support
[35:30.380 --> 35:31.980]  several attack scenarios.
[35:32.180 --> 35:34.880]  And we want the best effect.
[35:34.880 --> 35:35.880]  The best effect,
[35:35.880 --> 35:36.960]  as we define it,
[35:39.420 --> 35:40.860]  means that, as
[35:40.860 --> 35:41.700]  in this picture,
[35:41.700 --> 35:43.940]  the scaled result is closer to the man's face,
[35:43.940 --> 35:46.780]  while the attack picture stays closer to the girl's face,
[35:46.780 --> 35:47.700]  the original picture.
[35:47.700 --> 35:49.900]  That is our main scenario.
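The relation described here can be written compactly. This is a hedged reconstruction, not the formula from the speaker's slide: the symbols $S$ (original picture), $T$ (target picture), $\Delta$ (the added interference), $A$ (the crafted middle picture), the choice of norms, and the tolerance $\epsilon$ are all notation introduced for illustration.

```latex
% The crafted picture is the original plus a perturbation,
% and scaling it should give (approximately) the target.
A = S + \Delta, \qquad \mathrm{Scale}(A) \approx T

% One way to express "best effect": keep the perturbation small
% while forcing the scaled result close to the target.
\min_{\Delta} \; \lVert \Delta \rVert_2
\quad \text{subject to} \quad
\lVert \mathrm{Scale}(S + \Delta) - T \rVert_\infty \le \epsilon
```

A small $\lVert \Delta \rVert$ keeps the crafted picture visually close to the original, while the constraint keeps what the model receives close to the target.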
[35:49.900 --> 35:53.400]  Of course, there are several scenarios.
[35:53.400 --> 35:55.060]  The first one we call the strong attack scenario.
[35:55.060 --> 35:55.780]  What does it mean?
[35:55.780 --> 35:58.640]  You choose which target you want to attack,
[35:58.640 --> 35:59.640]  who I want to become.
[35:59.740 --> 36:03.400]  You also choose which picture I input.
[36:03.400 --> 36:04.000]  For example,
[36:04.000 --> 36:07.900]  I want Li Bingbing to become Zhao Benshan.
[36:07.900 --> 36:10.380]  This is called a strong constraint attack.
[36:10.500 --> 36:11.880]  There is another situation.
[36:12.340 --> 36:15.940]  This is called a weak constraint attack.
[36:15.940 --> 36:18.580]  I don't care if you use Li Bingbing
[36:18.580 --> 36:20.660]  or Fan Bingbing or any other picture.
[36:20.660 --> 36:22.240]  As long as you make it become Zhao Benshan,
[36:22.240 --> 36:23.700]  as long as the input looks different from Zhao Benshan,
[36:23.700 --> 36:25.180]  we call it a successful attack.
[36:25.180 --> 36:28.180]  This is the weak attack scenario.
[36:28.700 --> 36:29.720]  As for the specifics,
[36:29.720 --> 36:30.900]  the whole solving process,
[36:30.900 --> 36:32.820]  the solving and the optimization,
[36:32.820 --> 36:33.940]  I won't go through them.
[36:33.940 --> 36:36.040]  Maybe in this setting
[36:36.040 --> 36:38.640]  some of you would not enjoy it.
[36:38.640 --> 36:41.740]  But our paper is already out.
[36:42.000 --> 36:43.900]  If you are interested,
[36:43.900 --> 36:45.520]  we can share it with you.
[36:45.520 --> 36:47.800]  Let me show you the effect first.
[36:47.840 --> 36:49.060]  The first effect,
[36:49.060 --> 36:50.640]  I want to use this example.
[36:50.800 --> 36:52.700]  In English, it's called Face/Off.
[36:52.700 --> 36:54.920]  People who are a bit older may know
[36:54.920 --> 36:56.560]  that there was a movie by that name.
[36:57.260 --> 36:58.620]  It's a famous movie directed by John Woo.
[36:58.620 --> 36:59.840]  The Chinese title means Change Face.
[37:00.280 --> 37:02.000]  The plot of Face/Off is that
[37:02.000 --> 37:04.340]  Nicolas Cage and John Travolta
[37:04.340 --> 37:05.760]  swap faces.
[37:05.900 --> 37:07.000]  In fact,
[37:07.000 --> 37:08.280]  we can use our algorithm
[37:08.280 --> 37:12.040]  to swap people's faces
[37:12.040 --> 37:12.780]  directly.
[37:13.000 --> 37:14.320]  We put it into our system,
[37:14.320 --> 37:15.640]  and
[37:15.640 --> 37:18.340]  this is the synthesized result.
[37:18.340 --> 37:19.960]  And the scaled result is
[37:19.960 --> 37:21.900]  exactly what it is supposed to be.
[37:21.900 --> 37:22.800]  That was a scenario
[37:22.800 --> 37:24.580]  with a very strong constraint.
[37:25.540 --> 37:27.080]  Of course, you can see that
[37:27.080 --> 37:30.360]  because the generation is automated,
[37:30.360 --> 37:31.660]  you can give me any constraint.
[37:31.660 --> 37:33.260]  Brad Pitt,
[37:33.260 --> 37:34.980]  Robert Downey Jr.
[37:34.980 --> 37:36.260]  Anyone can become Brad Pitt.
[37:36.420 --> 37:38.420]  I can also do a weak attack.
[37:38.420 --> 37:41.160]  If I want the result to be Brad Pitt's picture,
[37:41.160 --> 37:43.000]  I don't care what the input is.
[37:44.180 --> 37:45.660]  Here it is a bunch of noise.
[37:45.660 --> 37:46.480]  This bunch of noise
[37:46.480 --> 37:48.780]  is transformed into Brad Pitt's image.
[37:49.500 --> 37:51.380]  This is the first example.
[37:51.380 --> 37:52.580]  I believe everyone understands
[37:52.580 --> 37:54.140]  the scene just now.
[37:54.140 --> 37:55.560]  It's not difficult to understand.
[37:57.280 --> 37:59.260]  Let's talk about the second attack scenario.
[37:59.260 --> 38:00.700]  Suppose we have an object recognition
[38:01.880 --> 38:03.480]  deep learning application.
[38:04.280 --> 38:05.900]  We need to identify the objects
[38:05.900 --> 38:06.820]  when we get a picture.
[38:07.460 --> 38:09.160]  Of course, I know that
[38:09.160 --> 38:09.400]  the input size of the model
[38:10.020 --> 38:11.920]  is not very large.
[38:11.920 --> 38:12.920]  So I give a big picture
[38:13.700 --> 38:16.400]  to the deep learning framework,
[38:16.400 --> 38:18.100]  which as usual uses scaling
[38:18.100 --> 38:21.620]  to transform it into this picture.
[38:21.820 --> 38:22.720]  These two pictures
[38:22.720 --> 38:27.520]  look almost the same.
[38:27.520 --> 38:29.200]  But I have a hint.
[38:29.200 --> 38:32.160]  I drew a circle
[38:32.160 --> 38:34.300]  to mark the car.
[38:34.300 --> 38:36.600]  This car exists in the original picture.
[38:37.300 --> 38:43.460]  It should be very clear.
[38:44.300 --> 38:47.060]  So this is what we talked about
[38:47.060 --> 38:47.460]  before.
[38:47.840 --> 38:49.740]  We made a more complicated scene
[38:49.740 --> 38:51.700]  and then drew two circles on it.
[38:52.220 --> 38:54.740]  Because scaling
[38:54.740 --> 38:56.040]  is so widely used,
[38:56.040 --> 38:58.420]  if you control an input,
[38:58.420 --> 38:59.720]  you can carry out
[39:00.840 --> 39:02.860]  a fairly complete attack.
[39:02.940 --> 39:04.700]  Let me give you a third example.
[39:05.080 --> 39:07.340]  It is still a street scene
[39:07.340 --> 39:08.560]  with a large input.
[39:08.860 --> 39:10.940]  Now the small picture shows a scene
[39:10.940 --> 39:11.300]  at a convenience store.
[39:11.320 --> 39:12.620]  I also have a hint.
[39:12.620 --> 39:16.660]  Let's see what data
[39:16.660 --> 39:17.600]  is lost in this scaling.
[39:17.980 --> 39:20.680]  Now I will give you the hint.
[39:20.960 --> 39:22.080]  Pay attention to the position
[39:22.080 --> 39:23.240]  where I drew the red circle.
[39:23.620 --> 39:25.020]  At the red circle,
[39:25.020 --> 39:25.940]  in the big picture,
[39:25.940 --> 39:30.660]  the traffic sign is a no-left-turn sign.
[39:30.680 --> 39:32.040]  After scaling,
[39:32.780 --> 39:35.140]  it becomes a left-turn-allowed sign.
[39:35.480 --> 39:38.320]  So for all this object recognition
[39:38.320 --> 39:40.100]  or traffic sign recognition,
[39:40.100 --> 39:42.000]  if there is such a problem,
[39:42.000 --> 39:44.400]  I believe you can easily understand
[39:44.400 --> 39:46.960]  that in a deep learning system,
[39:46.960 --> 39:48.060]  if you don't consider
[39:48.980 --> 39:51.980]  the preprocessing that you
[39:51.980 --> 39:53.160]  or the framework
[39:53.160 --> 39:57.120]  apply to the image,
[39:57.480 --> 39:59.060]  there will be such a problem.
[39:59.420 --> 40:00.220]  Of course,
[40:00.220 --> 40:01.480]  I think a broader scenario
[40:01.480 --> 40:03.580]  is what is called data poisoning.
[40:04.100 --> 40:05.260]  Data poisoning
[40:05.260 --> 40:08.240]  is, in research on
[40:08.240 --> 40:09.060]  deep learning security
[40:09.060 --> 40:10.560]  or artificial intelligence security,
[40:10.600 --> 40:11.720]  I should mention,
[40:11.720 --> 40:14.700]  actually a very broad scenario.
[40:15.260 --> 40:17.000]  Data poisoning
[40:17.000 --> 40:21.560]  means that
[40:22.380 --> 40:23.760]  when you are doing
[40:24.140 --> 40:25.660]  deep learning training,
[40:26.760 --> 40:28.800]  if the labels are wrong,
[40:28.800 --> 40:30.380]  your result
[40:30.380 --> 40:31.620]  may have something hidden in it,
[40:31.620 --> 40:33.580]  or the result will be wrong.
[40:33.580 --> 40:34.360]  But in the past,
[40:34.360 --> 40:35.080]  when people worried about
[40:35.080 --> 40:36.520]  data poisoning,
[40:36.900 --> 40:39.080]  they thought of
[40:39.080 --> 40:40.440]  two scenarios.
[40:40.440 --> 40:42.460]  One scenario is that
[40:42.460 --> 40:43.920]  the trainer accidentally
[40:43.920 --> 40:51.540]  labeled your data
[40:53.480 --> 40:54.720]  wrongly.
[40:54.920 --> 40:55.860]  Or the trainer
[40:55.860 --> 40:56.700]  and the attacker
[40:56.700 --> 40:59.140]  share a common goal.
[40:59.860 --> 41:00.520]  So
[41:00.520 --> 41:03.380]  my kind friends
[41:03.380 --> 41:04.640]  think that
[41:05.290 --> 41:07.260]  there is a risk of data poisoning,
[41:07.260 --> 41:08.620]  but basically
[41:08.620 --> 41:09.460]  it can be ignored.
[41:09.460 --> 41:10.400]  In fact,
[41:10.400 --> 41:11.280]  we want to say that
[41:11.280 --> 41:12.600]  data scaling
[41:12.600 --> 41:14.000]  has a great impact here.
[41:18.100 --> 41:19.460]  In fact,
[41:21.180 --> 41:22.420]  the risk of data poisoning
[41:22.420 --> 41:23.860]  is bigger than they think.
[41:24.320 --> 41:25.240]  if I put a little sheep
[41:25.240 --> 41:25.740]  here,
[41:25.740 --> 41:27.220]  it will turn into a cat.
[41:27.220 --> 41:30.460]  If I put a new sheep
[41:30.460 --> 41:32.420]  it will turn into a cat.
[41:32.580 --> 41:33.620]  Kind people
[41:36.320 --> 41:38.120]  like sheep.
[41:38.640 --> 41:40.080]  Next is the
[41:40.080 --> 41:41.940]  image of sheep.
[41:41.940 --> 41:43.720]  I can't see the sheep.
[41:43.720 --> 41:44.680]  Of course,
[41:44.680 --> 41:45.540]  you have to look carefully.
[41:45.540 --> 41:47.660]  Maybe it will turn into a cat.
[41:47.940 --> 41:49.940]  But after the data change,
[41:49.940 --> 41:51.400]  it will turn into a cat.
[41:51.400 --> 41:53.480]  It is like a cat
[41:53.480 --> 41:56.580]  with a missing leg.
[41:57.720 --> 42:00.980]  It is a cat with a missing leg.
[42:01.240 --> 42:02.820]  These examples
[42:02.820 --> 42:04.340]  are to tell you that
[42:04.340 --> 42:06.120]  we can use these methods
[42:06.120 --> 42:08.940]  to cause data poisoning.
[42:10.400 --> 42:11.820]  How do we prevent it?
[42:13.460 --> 42:15.100]  It is very difficult
[42:15.100 --> 42:15.420]  to do security research.
[42:15.420 --> 42:16.840]  Because besides attacking,
[42:16.840 --> 42:18.660]  at the same time
[42:18.660 --> 42:21.080]  you have to provide
[42:21.260 --> 42:22.360]  a way to prevent it.
[42:22.500 --> 42:24.600]  So I will tell you
[42:24.600 --> 42:25.880]  the basic idea.
[42:25.880 --> 42:27.500]  The basic idea is very simple.
[42:27.500 --> 42:28.220]  First of all,
[42:28.220 --> 42:29.160]  your app can say
[42:29.160 --> 42:32.520]  that if the data you give me
[42:32.520 --> 42:33.600]  differs from the model's input size,
[42:33.600 --> 42:34.760]  I will not accept it.
[42:36.020 --> 42:37.420]  It is possible that
[42:37.420 --> 42:38.380]  there is such a situation.
[42:38.380 --> 42:40.040]  Many times I am sure that
[42:40.040 --> 42:42.300]  I just want to see 28x28
[42:42.300 --> 42:44.280]  as my data input.
[42:44.280 --> 42:45.260]  That is all I expect.
[42:45.260 --> 42:46.860]  But for many applications,
[42:46.860 --> 42:48.660]  especially if it is a picture collection
[42:48.660 --> 42:49.280]  application,
[42:49.280 --> 42:54.240]  I think it is difficult
[42:54.240 --> 42:56.000]  to limit the size of the pictures
[42:56.000 --> 42:56.120]  that users upload to each other.
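The first precaution described above, refusing any input whose size differs from what the model expects, could be sketched roughly as follows. This is only an illustration and not the speaker's code: the names `EXPECTED_SIZE` and `accept_input` are hypothetical, and the 28x28 value is simply the example size mentioned in the talk.

```python
from typing import Tuple

# Assumed model input size (height, width); 28x28 is the example from the talk.
EXPECTED_SIZE: Tuple[int, int] = (28, 28)


def accept_input(height: int, width: int,
                 expected: Tuple[int, int] = EXPECTED_SIZE) -> bool:
    """Return True only if the image already has the model's input size.

    Rejecting everything else means the application never has to resize
    untrusted data before handing it to the model.
    """
    return (height, width) == expected
```

As the talk notes, this only works for applications that can dictate the input size; a service that accepts arbitrary user uploads usually cannot.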
[42:56.400 --> 42:58.620]  Then there is a second scenario.
[42:58.620 --> 43:00.160]  I only get my data
[43:00.160 --> 43:01.760]  from my specified
[43:01.760 --> 43:03.200]  sensor.
[43:04.440 --> 43:05.720]  If it does not come from my sensor,
[43:05.720 --> 43:07.260]  I do not accept it.
[43:07.560 --> 43:09.000]  And I have to make sure
[43:09.000 --> 43:11.200]  that my sensor is fine.
[43:11.200 --> 43:12.500]  In this case,
[43:12.500 --> 43:14.160]  I think there are many applications
[43:14.160 --> 43:16.280]  that can do this.
[43:16.280 --> 43:18.620]  I only accept data
[43:18.620 --> 43:22.000]  from a specific sensor.
[43:22.560 --> 43:23.640]  Of course,
[43:23.640 --> 43:26.620]  it does not mean that
[43:26.620 --> 43:27.520]  you have to do this
[43:27.520 --> 43:27.900]  every time.
[43:27.900 --> 43:28.820]  I just used the example of
[43:28.820 --> 43:29.960]  NVIDIA.
[43:35.060 --> 43:36.340]  Second,
[43:36.340 --> 43:40.840]  even if you only take data
[43:40.840 --> 43:41.640]  from a specific
[43:41.640 --> 43:43.300]  sensor,
[43:43.300 --> 43:44.780]  there are still some
[43:44.780 --> 43:45.060]  attacks,
[43:45.060 --> 43:46.160]  including on the transmission
[43:46.160 --> 43:48.180]  and on the sensor itself.
[43:48.260 --> 43:49.940]  This is one precaution.
[43:51.160 --> 43:52.640]  Third,
[43:52.640 --> 43:54.280]  we also did something.
[43:54.520 --> 43:55.520]  I used to say that
[43:55.520 --> 43:56.860]  people who are good at math
[43:57.740 --> 43:59.480]  are kind.
[43:59.840 --> 44:02.060]  Let's make it up.
[44:14.720 --> 44:16.560]  This is what we did.
[44:16.560 --> 44:19.520]  We did some short experiments.
[44:19.540 --> 44:20.620]  The basic idea is this.
[44:20.620 --> 44:21.520]  For example,
[44:21.520 --> 44:23.440]  when you process a picture,
[44:23.440 --> 44:25.260]  you check whether the content has changed or not.
[44:25.260 --> 44:26.500]  Of course, you cannot use another
[44:26.500 --> 44:27.820]  deep learning system to check
[44:27.820 --> 44:28.860]  whether the content has changed or not.
[44:28.860 --> 44:32.720]  So we decided to
[44:32.720 --> 44:35.840]  only observe the characteristics
[44:35.840 --> 44:36.180]  of the image itself.
[44:36.180 --> 44:36.820]  The characteristics of the image itself
[44:38.040 --> 44:40.480]  include the use of colors,
[44:40.480 --> 44:42.940]  how many times each color has been used,
[44:42.940 --> 44:43.520]  and
[44:44.160 --> 44:45.680]  where each specific color
[44:45.680 --> 44:47.340]  appears in the picture.
[44:48.300 --> 44:49.260]  For example,
[44:49.260 --> 44:51.460]  whether black is concentrated at one point
[44:51.460 --> 44:53.720]  or scattered all over the picture.
[44:53.720 --> 44:55.340]  This can also be measured.
[44:55.340 --> 44:56.880]  The basic idea is that
[44:56.880 --> 44:58.360]  if a normal picture,
[44:58.360 --> 44:59.100]  for example,
[44:59.100 --> 45:00.500]  after the transformation,
[45:00.500 --> 45:02.620]  should look the same as the original,
[45:02.620 --> 45:04.060]  then it should preserve
[45:04.060 --> 45:06.440]  the color ratio
[45:07.260 --> 45:09.800]  and the color density
[45:09.800 --> 45:10.540]  of the original image.
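The check described above, comparing the color usage of a picture before and after it is transformed, could be sketched as follows. This is a minimal illustration under stated assumptions and not the speaker's implementation: it assumes the transformation is a nearest-neighbor downscale (suggested by the talk's discussion of input sizes), it covers only the "how often each color is used" part and not the spatial spread of colors, and the names `channel_histogram`, `histogram_distance`, `downscale_nearest`, `looks_suspicious` and the threshold 0.5 are all hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each value 0..255
Image = List[List[Pixel]]      # rows of pixels


def channel_histogram(img: Image, channel: int, bins: int = 8) -> List[float]:
    """Normalized histogram of one color channel (0=R, 1=G, 2=B)."""
    counts = [0] * bins
    total = 0
    for row in img:
        for px in row:
            counts[px[channel] * bins // 256] += 1
            total += 1
    return [c / total for c in counts]


def histogram_distance(a: Image, b: Image, bins: int = 8) -> float:
    """Largest per-channel L1 distance between histograms (range 0..2)."""
    worst = 0.0
    for ch in range(3):
        ha = channel_histogram(a, ch, bins)
        hb = channel_histogram(b, ch, bins)
        worst = max(worst, sum(abs(x - y) for x, y in zip(ha, hb)))
    return worst


def downscale_nearest(img: Image, factor: int) -> Image:
    """Assumed transformation: keep every `factor`-th pixel in each direction."""
    return [row[::factor] for row in img[::factor]]


def looks_suspicious(img: Image, factor: int = 4, threshold: float = 0.5) -> bool:
    """Flag the image if downscaling changes its color usage too much.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    return histogram_distance(img, downscale_nearest(img, factor)) > threshold
```

A uniformly colored image gives a distance of 0 and is not flagged. An image that is almost entirely white but has black pixels exactly at the positions the downscale keeps becomes all black after the transformation, so its histogram shifts sharply and the check flags it.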
[45:10.660 --> 45:13.400]  Let me give you a few simple examples.
[45:13.400 --> 45:13.920]  The first one is
[45:13.920 --> 45:15.080]  when we transform the sheep
[45:16.060 --> 45:17.220]  into a wolf,
[45:17.220 --> 45:20.300]  we take these three pictures
[45:20.300 --> 45:21.480]  and put them in one chart
[45:21.480 --> 45:24.380]  to measure the color density.
[45:24.440 --> 45:25.400]  You will find that
[45:26.060 --> 45:28.860]  the red and green curves
[45:28.860 --> 45:33.800]  are basically the same,
[45:34.440 --> 45:35.160]  but the blue curve
[45:35.160 --> 45:36.260]  is different.
[45:38.000 --> 45:39.480]  If the original picture
[45:39.480 --> 45:39.980]  is like this,
[45:39.980 --> 45:41.700]  you will find that
[45:41.700 --> 45:43.920]  the blue and red curves
[45:43.920 --> 45:45.340]  are basically the same.
[45:45.340 --> 45:45.840]  Basically,
[45:45.840 --> 45:46.300]  we can use this example
[45:46.300 --> 45:47.140]  to show that
[45:56.660 --> 45:59.040]  there is a huge difference
[45:59.040 --> 45:59.420]  between the original picture
[45:59.420 --> 46:00.120]  and the new picture.
[46:00.120 --> 46:02.740]  This is a very simple example
[46:02.740 --> 46:04.260]  to show that
[46:04.260 --> 46:07.340]  it is possible to detect the difference.
[46:07.340 --> 46:08.400]  In conclusion,
[46:09.640 --> 46:13.100]  I would like to emphasize
[46:13.100 --> 46:13.420]  that
[46:13.420 --> 46:14.500]  there may be
[46:14.500 --> 46:17.020]  other effects.
[46:17.020 --> 46:17.940]  We can see that
[46:17.940 --> 46:18.700]  deep learning
[46:18.700 --> 46:23.260]  focuses on the ability
[46:23.260 --> 46:27.420]  to process
[46:27.420 --> 46:29.520]  the data
[46:29.520 --> 46:31.740]  and on the data chain.
[46:31.740 --> 46:32.240]  However,
[46:32.240 --> 46:32.960]  we should not forget
[46:32.960 --> 46:35.560]  that the model's input
[46:35.560 --> 46:38.300]  is fixed in size.
[46:38.300 --> 46:39.360]  At the same time,
[46:39.360 --> 46:41.040]  I would like to emphasize
[46:41.040 --> 46:42.780]  that those of us who work in
[46:42.780 --> 46:44.660]  classical security,
[46:45.020 --> 46:45.980]  although we are old-school,
[46:45.980 --> 46:47.020]  still need to think about
[46:47.020 --> 46:48.380]  the details and
[46:48.380 --> 46:51.460]  find the assumptions.
[46:53.480 --> 46:54.160]  We need to build
[46:54.160 --> 46:55.560]  on those assumptions.
[46:56.500 --> 47:00.160]  This is the classical way of thinking.
[47:00.160 --> 47:00.360]  However,
[47:00.360 --> 47:01.300]  it still applies
[47:01.300 --> 47:02.880]  in the current scenario.
[47:02.880 --> 47:04.720]  This morning,
[47:04.720 --> 47:08.040]  I pointed out that
[47:08.040 --> 47:10.180]  if we want to assess
[47:10.180 --> 47:11.400]  the security of a program,
[47:11.400 --> 47:12.080]  we need to look at
[47:12.080 --> 47:13.520]  the assumptions
[47:13.520 --> 47:14.200]  it makes.
[47:15.700 --> 47:16.520]  There are many
[47:16.520 --> 47:17.020]  places where
[47:17.020 --> 47:17.840]  the data was not
[47:17.840 --> 47:19.160]  used safely.
[47:19.160 --> 47:20.520]  The second thing is
[47:20.520 --> 47:21.740]  that we need to
[47:21.740 --> 47:22.740]  look at the results
[47:22.740 --> 47:25.480]  of the data
[47:25.480 --> 47:26.680]  processing.
[47:32.060 --> 47:33.620]  We need to look not only
[47:33.620 --> 47:36.620]  at the data processing,
[47:36.620 --> 47:37.640]  but also
[47:37.640 --> 47:38.440]  at the data itself.
[47:39.240 --> 47:41.280]  I often use the word sampling.
[47:41.280 --> 47:42.220]  When I use the word sampling,
[47:42.220 --> 47:44.300]  any information that should not be used
[47:46.720 --> 47:48.080]  can be taken down.
[47:48.080 --> 47:51.200]  This is a famous example.
[47:51.200 --> 47:52.940]  Let me think about the sound.
[47:53.220 --> 47:54.660]  Mr. Qiu Dalu and Mr. Xie Yunan
[47:54.660 --> 47:56.400]  made the sound of seagulls.
[47:56.400 --> 47:57.820]  In fact, in a sense,
[47:57.820 --> 48:00.080]  it is also a sound that can be made again.
[48:00.580 --> 48:02.580]  The third thing I want to say
[48:02.580 --> 48:04.060]  is that I often emphasize
[48:04.060 --> 48:05.900]  that I am not deliberately
[48:05.900 --> 48:07.260]  attacking the students
[48:07.260 --> 48:08.260]  who are studying for their master's degree.
[48:10.580 --> 48:12.060]  In the past,
[48:12.060 --> 48:13.820]  they were very careful about these calculations,
[48:13.820 --> 48:15.300]  but they really did not
[48:16.580 --> 48:18.840]  consider the malicious input.
[48:19.180 --> 48:20.740]  With these malicious inputs,
[48:20.740 --> 48:21.600]  in other words,
[48:21.600 --> 48:23.680]  I give it a picture
[48:23.680 --> 48:29.460]  and it produces a wrong result.
[48:29.800 --> 48:31.060]  The above is basically the summary
[48:31.060 --> 48:32.680]  of our talk today.
[48:32.680 --> 48:34.820]  Finally, I want to advertise that
[48:34.820 --> 48:37.000]  some of the work we have done
[48:37.000 --> 48:38.980]  on attacking frameworks
[48:38.980 --> 48:40.880]  in the past
[48:41.940 --> 48:44.080]  will be released at the end of May
[48:44.080 --> 48:47.240]  at the IEEE Security and Privacy
[48:47.240 --> 48:48.760]  Conference.
[48:48.940 --> 48:50.040]  We will hold
[48:50.860 --> 48:51.980]  the first
[48:51.980 --> 48:53.480]  deep learning
[48:53.480 --> 48:54.520]  and security
[48:55.600 --> 48:56.440]  workshop
[48:57.000 --> 48:58.960]  in San Francisco.
[48:58.960 --> 49:01.260]  Our work will be published there,
[49:01.260 --> 49:02.580]  and there are many other works
[49:02.580 --> 49:03.540]  there as well.
[49:03.540 --> 49:04.360]  If you have any questions,
[49:04.360 --> 49:05.640]  please feel free to contact us.
[49:05.640 --> 49:07.480]  I will also leave my email address
[49:07.480 --> 49:08.680]  on it.
[49:08.680 --> 49:10.320]  If you have any questions,
[49:10.320 --> 49:11.880]  please contact us online.
